Research Article | Open Access
Volume 2026 | Article ID 100042 | https://doi.org/10.1016/j.bidere.2025.100042

DeepCodon: A deep learning codon-optimization model to enhance protein expression

Xudong Han,1,2,5 Xiaotong Shao,1,2,5 Shuo Liu,1,2 Zhenkun Shi,2,3 Rong Huang,1,2 Huanyu Chu,2,4 Hejian Zhang,1,2 Ruoyu Wang,2,3 Haoran Li,2,3 Xiaoping Liao,2,3 Jian Cheng,2,4 Huifeng Jiang,2,4

1College of Biotechnology, Tianjin University of Science and Technology, Tianjin 300457, PR China
2Key Laboratory of Engineering Biology for Low-carbon Manufacturing, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, PR China
3Biodesign Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, PR China
4National Center of Technology Innovation for Synthetic Biology, Tianjin 300308, PR China
5These authors contributed equally to this work

Received 23 May 2025
Accepted 28 Jul 2025
Published 28 Aug 2025

Abstract

Codon optimization, a critical task in genetic engineering and synthetic biology, enhances heterologous gene expression by modulating synonymous codon usage. Achieving optimal expression requires balancing multiple interdependent factors, such as host codon bias, GC content and mRNA secondary structure, which makes optimization a challenging multiobjective problem. Here, we introduce DeepCodon, a novel deep learning tool focused on preserving functionally important rare codon clusters, which previous methods often overlook. With Escherichia coli as the expression host, we first trained a protein-to-CDS translation model on 1.5 million natural Enterobacteriaceae sequences and then fine-tuned it on highly expressed genes. To protect functionally important rare codon clusters, we integrated a conditional probability strategy that preserves conserved rare codons. Compared with conventional approaches, DeepCodon generates sequences that better match host preferences, achieves superior in silico metrics and maintains critical rare codons. Experimental validation of seven low-yield P450s and thirteen AI-designed G3PDHs in E. coli showed that DeepCodon outperformed traditional methods in nine cases. These results demonstrate DeepCodon's potential as a practical solution for codon optimization.
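To make the ideas of synonymous recoding and rare-codon preservation concrete, the following minimal Python sketch recodes a short coding sequence with a "most preferred codon" rule while leaving flagged positions untouched. It is not DeepCodon's method, which is a trained deep learning model; it is only a naive rule-based baseline. The table `PREFERRED`, the function names and the `keep_positions` parameter are hypothetical, and the table is a toy covering five amino acids, not measured E. coli codon usage.

```python
# Toy tables for illustration only (assumption: NOT measured E. coli usage data).
CODON_TO_AA = {
    "ATG": "M",
    "CTG": "L", "CTA": "L", "TTA": "L",
    "CGT": "R", "AGG": "R", "AGA": "R",
    "AAA": "K", "AAG": "K",
    "GGC": "G", "GGA": "G",
}
# Hypothetical "preferred" synonymous codon for each amino acid.
PREFERRED = {"M": "ATG", "L": "CTG", "R": "CGT", "K": "AAA", "G": "GGC"}


def split_codons(cds):
    """Split a coding sequence into consecutive 3-nt codons."""
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a CDS using the toy codon table above."""
    return "".join(CODON_TO_AA[codon] for codon in split_codons(cds))


def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def recode(cds, keep_positions=frozenset()):
    """Replace each codon with the preferred synonym, except at the
    zero-based codon indices in keep_positions, which are left unchanged
    (standing in for conserved rare codons that should be preserved)."""
    out = []
    for index, codon in enumerate(split_codons(cds)):
        if index in keep_positions:
            out.append(codon)
        else:
            out.append(PREFERRED[CODON_TO_AA[codon]])
    return "".join(out)


original = "ATGCTAAGGAAGGGA"                    # encodes M L R K G
fully_recoded = recode(original)                # every codon replaced
rare_preserved = recode(original, keep_positions={2})  # codon 2 (AGG) kept

# Synonymous recoding must never change the encoded protein.
assert translate(fully_recoded) == translate(original)
assert translate(rare_preserved) == translate(original)
print(fully_recoded, round(gc_content(fully_recoded), 3))
print(rare_preserved, round(gc_content(rare_preserved), 3))
```

In this sketch the positions to preserve are supplied by hand; deciding which rare codons are functionally important is exactly the part that the abstract attributes to the conditional probability strategy.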

© 2019-2023 BioDesign Research. All rights reserved. ISSN 2693-1257.